Understanding Your Dataset
About Your Response Dataset
For additional analysis outside of Qualtrics, you can download a dataset file for any survey. This dataset includes all your survey’s raw response data, which covers everything from answers to survey questions, to additional metadata like duration and dates, to embedded data, and more.
On this page, we will:
- Explain different columns of information you might see in your data file.
- Explain how answers may be formatted in this data file, depending on the column.
- Link out to other resources that allow you to understand or customize your data download.
However, this page does not discuss statistical analysis, or tell you how to interpret the results of your data, beyond the literal (e.g., this respondent marked that they were highly satisfied). There are so many different variables and projects that go into research, and while we’d love to say we know everything, it really depends on the specifics of your study and how you set it up!
Types of Response Datasets Covered By This Page
This page can help you understand the raw data you export from the following types of projects:
There are a few other types of projects where you can export response data. However, there are important differences to keep in mind:
- For 360 projects, see Understanding Your Response Dataset (360).
- For all other Employee Experience projects, see Understanding Your Response Dataset (EX).
Technically, Conjoint and MaxDiff datasets are also formatted as described on this page when exported from Data & Analysis. However, this data export excludes Conjoint- and MaxDiff-specific data.
File Format Basics
Every row of the file is a different survey response (although not necessarily different respondents, if you let people answer multiple times). Every column is a type of survey data.
CSV and TSV files come with 3 rows of headers. The first header is the internal Qualtrics ID of the field (e.g., EndDate, Q1, Q2, and so on). The second header is the field’s name or text (e.g., End Date, How satisfied are you with Qualtrics?). The third header has import IDs. All 3 of these headers are included because they are needed to upload the data to a survey. Respondent data starts on the fourth row of the file.
Respondent Information
The first several columns in a dataset include information about each respondent and their response, such as their name, IP address, response submission dates, and so on. We’ll list each of those columns and how to understand their content here.
StartDate
These date and time values indicate when the respondents first clicked the survey link.
EndDate
These date and time values indicate when the respondent submitted their survey. If the entry is an incomplete response, this date will indicate the last time the respondent interacted with the survey.
Status
The value in the Status column indicates the type of response collected. These are the possible statuses, presented in both numeric and choice text format:
- 0 / IP Address: A normal response
- 1 / Survey Preview: A preview response
- 2 / Survey Test: A test response
- 4 / Imported: An imported response
- 16 / Offline: A Qualtrics Offline App response
- 17 / Offline Preview: Previews submitted through the Qualtrics Offline App. This feature is deprecated in latest versions of the app
IPAddress
This column includes the respondent’s IP address. This data will not be available if responses have been completely anonymized.
Duration
The number of seconds it took the respondent to complete the survey. This is the entire duration of the response; if a respondent stops in the middle of the survey, closes the browser, and comes back another day, that time is counted.
Finished and Progress
The Finished column details whether the response was submitted or closed. A “1” or “TRUE” indicates the respondent reached an end point in their survey (hitting the last Next/Submit button, being screened-out with Skip or Branch Logic, etc.). A “0” or “FALSE” indicates the respondent left their survey before reaching an end point and the response was instead closed manually or due to session expiration.
The Progress column shows the progress a respondent made in the survey before finishing. For those marked as “1” or “TRUE” in the Finished column, the Progress is marked 100, regardless of whether they were screened out. For those whose responses are marked “0” or “FALSE,” you will get an exact percentage of how far they got in the survey based on what question they left off on.
RecordedDate
This column indicates when a survey was recorded in Qualtrics. For users taking surveys online, this date and time will be very similar to End Date. However, for responses that are imported, or uploaded from the Offline App, Recorded Date will often differ from End Date, reflecting when you manually uploaded the results, not when the survey taker finished.
Qtip: Noticing a difference of several minutes between End Date and Recorded Date? A slow internet connection can delay the time between when the respondent submits their survey data and when Qualtrics officially saves it to the website.
ResponseID
The ResponseID is the ID Qualtrics uses to identify each response in the database. This unique identifier is provided as a reference and generally does not have a use in data analysis.
RecipientLastName, RecipientFirstName, and RecipientEmail
Respondents’ names and email addresses will display in these columns if your survey was distributed using a contact list. Some of the common distributions that use contact lists include:
For all other responses, such as those collected with an anonymous link or with certain Survey Options enabled, these columns will be blank. Please note that any distribution can be anonymized, regardless of the method of distribution.
ExternalDataReference
Frequently used when uploading a contact list to Qualtrics for use in an authenticator and occasionally in an email distribution, an external data reference can be included for each participant. This is a generic field that can store any information you like (and is most often used for unique identifiers like employee or student IDs). If an external data reference was added to the contact list, it will display in this column. If you did not choose to use this field, the column will be blank.
LocationLatitude and LocationLongitude
If the respondent completed the survey using the Qualtrics Offline App on a GPS-enabled device, this data will be an accurate representation of the respondent’s location.
For all other respondents, the location is an approximation determined by comparing the participant’s IP address to a location database. Inside the United States, this data is typically accurate to the city level. Outside the United States, this data is typically only accurate to the country level.
This data will not be available if responses have been completely anonymized.
DistributionChannel
This column describes the method of survey distribution.
In the above example, the survey was emailed to participants.
UserLanguage
If your survey has multiple languages, the respondent’s language code will be displayed in this column.
Even if a survey has only 1 language, every response should have data in the UserLanguage column, including previews. The only exception is test responses, which will have a blank UserLanguage column.
Question Responses
The next columns in the dataset display the answers provided for each survey question. Columns are headed with the question numbers (e.g., Q1) and then the beginning lines of the question text (e.g., How easy was it to understand the reading assignment?).
Simple questions (text entry, multiple choice – 1 answer, etc.) will be contained in 1 column, but more complex questions with multiple statements (matrix table, side by side, etc.) will be spread across multiple columns.
By default, data is downloaded in choice text (i.e., the exact text of questions and answers). However, you can also choose to export it with numeric values (called “recode values“) for each answer choice. For example, on a 5 point scale, “Strongly Agree” would display as a “5”, making it easier to find a mean or do other statistical analysis.
If the numeric coding of your choices doesn’t match your expectations, you can always return to the Survey tab to change them and then export your data again. You can also export your survey to Word to retrieve a code book outlining how each choice is coded in the dataset.
Guide to Specific Question Types
The exported data will often look different based on the question types you chose to include. These differences are explained on the question’s individual page. We have linked to the relevant sections below.
- Multiple choice (Including how multiple-answer exports appear)
- Matrix table
- Text entry
- Form field
- Slider
- Rank order
- Side by side
- Constant sum
- Pick, group, & rank
- Hot spot
- Heat map
- Graphic slider
- Drill down
- Net promoter® score
- Highlight
- Signature
- Timing
- Meta info
- File upload
Questions in Loop & Merge Blocks
When viewing your data, each loop is treated as a separate set of questions. If you have 5 possible loops, you will see the looped questions repeated 5 times in the data. Even if a respondent isn’t shown all of the loops, all possible loops will be represented in the data.
Scoring Results
For surveys using Scoring, the score for each category is included in the dataset. Each scoring category gets its own column of data. In the example below, the survey only had 1 scoring category, called “Readability.”
This score is a sum of the points the respondent earned in the category, not an average.
The “SC” in the header stands for “Scoring Category,” and the number represents the number category it is, counting up from zero. Because the example survey above has 1 scoring category, we see SC0, but if there were more, we would see SC1, SC2, and so on.
Embedded Data
For surveys using embedded data, embedded data information is included in the columns following scoring information.
Only embedded data fields saved in the survey flow are included in the downloaded dataset. Embedded data fields with values from a contact list or URL can be saved to the survey flow at any point before or after data collection.
Randomization Data
You will see randomization columns following your question response data. There will be a column for each randomized element in the survey. For example, if you randomized a block with 5 questions in it, you’ll have 5 columns, 1 for each question. If you had a randomizer in your survey flow with 7 elements under it, you’d have 7 columns, 1 for each element that was randomized.
If you randomize the order of all elements presented, the column will show what order that element was presented, e.g., 1, 2, 3, and so on.
Example: In the screenshot below, the order of the questions was randomized and the columns show the order each question appeared in the sequence. Note that the question numbers are in the headers.
When you randomly present 1 element out of a list of several, elements that were displayed will be marked as 1. Elements that were not displayed to the respondent will be blank.
Example: In the example below, 1 element was randomly presented from a list of 3. The columns indicate which element was shown to the respondent by placing a 1 beneath the column labeled for the element presented.
Qtip: Confused about how to read the headers in the second example? Because the survey flow randomizer was used in this example, the column displays flow IDs instead of Question IDs. You can retrieve flow IDs by going to your survey flow and selecting Show flow IDs in the top-right corner. Flow IDs cannot be edited.
Troubleshooting a Data File
This section will go over some common questions and concerns that many users have regarding their data file. We’ll also highlight some helpful features you can use to customize your data exports.
Guide to Exporting Data & All the Options Available
For instructions on how to export data, see the following support pages:
- Exporting Response Data: Step-by-step instructions and tips.
- Data Export Options: Guide to additional options, such as exporting randomized data, labelling seen but unanswered questions, exporting in numeric vs. text format, and so on.
- Data Export Formats: Guide to different file types you can export.
Customizing the Columns Included in the Export
To customize the columns in your exports,
- Choose columns you want to export, and deselect the columns you don’t.
- Export your data with the option called Download all fields deselected.
Exporting Filtered Data
To export filtered data,
Features That Do Not Have Data to Include in an Export
If you have included a descriptive text (such as an introduction paragraph with no attached question) or a graphic (such as a picture with no attached question), these fields will not have their own columns in the data export, since they have no answers for the respondent to select. If you noticed question numbers being “skipped” in your survey export, it may be because you have fields like this included.
However, if you randomized when descriptive text or graphic fields are displayed to respondents, you can find this data by exporting the randomized data. See Exporting Randomized Data for step-by-step instructions, and see Randomization Data for a guide on how to read the output.
Certain Answer Choices Excluded From File / How to Exclude Values from Analysis
See Exclude from Analysis. Some values, based on how they are worded, are excluded by default. These responses are recorded and can be added back to the data at any time with no issue.
Embedded Data Excluded From File
Make sure the embedded data element is added to the survey flow and is pulling in all contact fields.
For deeper troubleshooting with embedded data, please see the linked support pages.
Other Fields Excluded From File
Make sure you have Download all fields selected when you export your data. Randomized data is not included by default, but can be added according to these instructions.
See additional export options.
Customize Question Numbers
See Auto-Numbering Questions and Question Numbers.
Customize the Question or Answer Wording in the Data Export, but not the Survey
You can use question labels to change how questions themselves are worded in the export, without affecting how the questions appear to survey respondents. You can use variable names to change the wording of answer choices. It is safe to edit question labels and variable names at any time during survey collection.
Question labels and variable names also affect how this data appears in results and reports.
Wording of Questions / Answers Differs from Survey Editor
If the wording of questions is different from what you see in the survey editor, double-check that there are no question labels added. If the wording of your answers differs from what you see in the survey editor, check your variable names under recode options. If you created your survey from a copy of an older version, these settings can be carried over. It is safe to edit question labels and variable names at any time during survey collection.
CSV Export Issues
If your CSV export doesn’t look right – e.g., has symbols instead of the expected text or has columns running into each other – export the data in TSV format instead. TSV is especially useful for data that contains special characters.
For additional troubleshooting, see Trouble with Downloaded CSV Files.
Response Files from 360, Engagement, Lifecycle, and Ad Hoc Employee Research Projects
For 360, see Understanding Your Response Dataset (360).
For all other EX projects, see Understanding Your Response Dataset (EX).
File Format Differences
Though all file types download the same data fields described above, each features a layout that may be slightly different.
SPSS
The Data View in SPSS includes the exact same layout as the CSV file, with fewer headers.
SPSS includes another view, called the Variable View. This view lists all the variables from your dataset with information about each, such as the variable type and the possible values.
XML
The XML file type is often used when integrating Qualtrics data with a third-party database. This file type can be parsed easily by common database software.
An XML element is provided for each response, with a child element for each piece of data stored in that response.